
Linux Essentials

The Linux Community and a Career in Open Source

Introduction

Linux is one of the most popular operating systems. Its development was started in 1991 by Linus Torvalds. The operating system was inspired by Unix, another operating system developed in the 1970s at AT&T's Bell Laboratories. Unix was geared towards small computers. At the time, “small” computers were machines that did not need an entire air-conditioned hall and cost less than one million dollars; later, they were machines that could be lifted by two people. At that time, an affordable Unix system was not readily available on office computers, which tended to be based on the x86 platform. Therefore Linus, who was a student at the time, started to implement a Unix-like operating system that was supposed to run on this platform. Linux uses most of the same principles and basic ideas as Unix, but it does not contain Unix code; it is an independent project. Linux is not backed by an individual company but by an international community of programmers. As it is freely available, it can be used by anyone without restrictions.

Distributions

A Linux distribution is a bundle that consists of a Linux kernel and a selection of applications maintained by a company or user community. A distribution’s goal is to optimize the kernel and the applications that run on the operating system for a certain use case or user group. Distributions often include distribution-specific tools for software installation and system administration. This is why some distributions are mainly used for desktop environments, where they need to be easy to use, while others are mainly used on servers, where they need to use the available resources as efficiently as possible. Another way to classify distributions is by the distribution family they belong to. Distributions of the Debian family use the package manager dpkg to manage the software that runs on the operating system. Packages that can be installed with the package manager are maintained by volunteer members of the distribution’s community. Maintainers use the deb package format to specify how the software is installed on the operating system and how it is configured by default. Just like a distribution, a package is a bundle of software with corresponding configuration and documentation that makes it easier for the user to install, update and use the software. Debian GNU/Linux is the biggest distribution of the Debian family. The Debian GNU/Linux Project was launched by Ian Murdock in 1993, and today thousands of volunteers work on it. Debian GNU/Linux aims to provide a very reliable operating system. It also promotes Richard Stallman’s vision of an operating system that respects the user’s freedoms to run, study, distribute and improve the software, which is why it does not provide any proprietary software by default. Ubuntu is another Debian-based distribution worth mentioning.
Ubuntu was created by Mark Shuttleworth and his team in 2004, with the mission of bringing an easy-to-use Linux desktop environment. Ubuntu’s mission is to provide free software to everyone across the world as well as to cut the cost of professional services. The distribution has a scheduled release every six months, with a long-term support release every two years. Red Hat is a Linux distribution developed and maintained by the identically named software company, which was acquired by IBM in 2019. The Red Hat Linux distribution was started in 1994 and re-branded in 2003 as Red Hat Enterprise Linux, often abbreviated RHEL. It is provided to companies as a reliable enterprise solution that is supported by Red Hat and comes with software that aims to ease the use of Linux in professional server environments. Some of its components require fee-based subscriptions or licenses. The CentOS project uses the freely available source code of Red Hat Enterprise Linux and compiles it into a distribution which is available completely free of charge, but in return does not come with commercial support.

Both RHEL and CentOS are optimized for use in server environments. The Fedora project was founded in 2003 and creates a Linux distribution aimed at desktop computers. Red Hat initiated the Fedora distribution and has maintained it ever since. Fedora is very progressive, adopts new technologies very quickly and is sometimes considered a test bed for new technologies which might later be included in RHEL. All Red Hat based distributions use the package format rpm. The company SUSE was founded in 1992 in Germany as a Unix service provider. The first version of SUSE Linux was released in 1994. Over the years SUSE Linux became mostly known for its YaST configuration tool, which allows administrators to install and configure software and hardware and to set up servers and networks. Similar to RHEL, SUSE releases SUSE Linux Enterprise Server, its commercial edition. This is released less frequently and is suitable for enterprise and production deployment. It is distributed as a server as well as a desktop environment, with fit-for-purpose packages. In 2004, SUSE released the openSUSE project, which opened opportunities for developers and users to test and further develop the system. The openSUSE distribution is freely available to download. Independent distributions have been released over the years as well. Some of them are based on either Red Hat or Ubuntu, some are designed to improve a specific property of a system or hardware. There are distributions built with specific functionalities, like QubesOS, a very secure desktop environment, or Kali Linux, which provides an environment for exploiting software vulnerabilities, mainly used by penetration testers. Recently, various very small Linux distributions have been designed specifically to run in Linux containers, such as Docker. There are also distributions built specifically for components of embedded systems and even smart devices.

Embedded Systems

Embedded systems are a combination of computer hardware and software designed to have a specific function within a larger system. Usually they are part of other devices and help to control these devices. Embedded systems are found in automotive, medical and even military applications. Due to their wide variety of applications, a variety of operating systems based on the Linux kernel has been developed for use in embedded systems. A significant share of smart devices run a Linux kernel based operating system. With embedded systems comes embedded software, whose purpose is to access the hardware and make it usable. The major advantages of Linux over proprietary embedded software include cross-vendor platform compatibility, development, support and no license fees. Two of the most popular embedded software projects are Android, which is mainly used on mobile phones across a variety of vendors, and Raspbian, which is used mainly on the Raspberry Pi.

Android

Android is mainly a mobile operating system developed by Google. Android Inc. was founded in 2003 in Palo Alto, California. The company initially created an operating system meant to run on digital cameras. In 2005, Google bought Android Inc. and developed it into one of the biggest mobile operating systems. The base of Android is a modified version of the Linux kernel with additional open source software. The operating system is mainly developed for touchscreen devices, but Google has also developed versions for TVs and wristwatches. Different versions of Android have been developed for game consoles, digital cameras, as well as PCs. Android is freely available as open source through the Android Open Source Project (AOSP). Google offers a series of proprietary components in addition to the open source core of Android. These components include applications such as Google Calendar, Google Maps, Google Mail, the Chrome browser, as well as the Google Play Store, which facilitates the easy installation of apps. Most users consider these tools an integral part of their Android experience, which is why almost all mobile devices shipped with Android in Europe and America include proprietary Google software. Android on embedded devices has many advantages: the operating system is intuitive and easy to use with a graphical user interface, it has a very wide developer community, so it is easy to find help for development, and it is supported by the majority of hardware vendors with Android drivers, making it easy and cost-effective to prototype an entire system.

Raspbian and the Raspberry Pi

Raspberry Pi is a low-cost, credit-card sized computer that can function as a fully featured desktop computer, but it can also be used within an embedded Linux system. It is developed by the Raspberry Pi Foundation, an educational charity based in the UK, whose main purpose is to teach young people to program and to understand how computers work. The Raspberry Pi can be designed and programmed to perform desired tasks or operations that are part of a much more complex system. The specialties of the Raspberry Pi include a set of General Purpose Input-Output (GPIO) pins which can be used to attach electronic devices and extension boards. This allows using the Raspberry Pi as a platform for hardware development. Although it was intended for educational purposes, Raspberry Pis are used today in various DIY projects as well as for industrial prototyping when developing embedded systems. The Raspberry Pi uses ARM processors. Various operating systems, including Linux, run on the Raspberry Pi. Since the Raspberry Pi does not contain a hard disk, the operating system is started from an SD memory card. One of the most prominent Linux distributions for the Raspberry Pi is Raspbian. As the name suggests, it belongs to the Debian distribution family. It is customized to be installed on the Raspberry Pi hardware and provides more than 35,000 packages optimized for this environment. Besides Raspbian, numerous other Linux distributions exist for the Raspberry Pi, like, for example, Kodi, which turns the Raspberry Pi into a media center.

Linux and the Cloud

The term cloud computing refers to a standardized way of consuming computing resources, either by buying them from a public cloud provider or by running a private cloud. As of 2017 reports, Linux runs 90% of the public cloud workload. Every cloud provider, from Amazon Web Services (AWS) to Google Cloud Platform (GCP), offers different forms of Linux. Even Microsoft offers Linux-based virtual machines in their Azure cloud today. Linux is usually offered as part of an Infrastructure as a Service (IaaS) offering. IaaS instances are virtual machines which are provisioned within minutes in the cloud. When starting an IaaS instance, an image is chosen which contains the data that is deployed to the new instance. Cloud providers offer various images containing ready-to-run installations of popular Linux distributions as well as their own versions of Linux. The cloud user chooses an image containing their preferred distribution and can access a cloud instance running this distribution shortly after. Most cloud providers add tools to their images to adjust the installation to a specific cloud instance. These tools can, for example, extend the file systems of the image to fit the actual hard disk of the virtual machine.

1.2 Major Open Source Applications

Introduction

An application is a computer program whose purpose is not directly tied to the inner workings of the computer, but to tasks performed by the user. Linux distributions offer many application options to perform a variety of tasks, such as office applications, web browsers, multimedia players and editors, etc. There is often more than one application or tool to perform a particular job. It is up to the user to choose the application which best fits their needs.

Software Packages

Almost every Linux distribution offers a pre-installed set of default applications. Besides those pre-installed applications, a distribution has a package repository with a vast collection of applications available to install through its package manager. Although the various distributions offer roughly the same applications, several different package management systems exist. For instance, Debian, Ubuntu and Linux Mint use the dpkg, apt-get and apt tools to install software packages, generally referred to as DEB packages. Distributions such as Red Hat, Fedora and CentOS use the rpm, yum and dnf commands instead, which in turn install RPM packages. As the application packaging is different for each distribution family, it is very important to install packages from the correct repository designed for the particular distribution. The end user usually doesn’t have to worry about those details, as the distribution’s package manager will choose the right packages, the required dependencies and future updates. Dependencies are auxiliary packages needed by programs. For example, if a library provides functions for dealing with JPEG images which are used by multiple programs, this library is likely packaged in its own package, on which all applications using the library depend. The commands dpkg and rpm operate on individual package files. In practice, almost all package management tasks are performed by the commands apt-get or apt on systems that use DEB packages, or by yum or dnf on systems that use RPM packages. These commands work with catalogues of packages, can download new packages and their dependencies, and check for newer versions of the installed packages.
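The family-to-tool pairing described above can be sketched as a small shell function. This is only an illustration, not a real system utility: the function name and the set of distribution names are made up for the example.

```shell
#!/bin/sh
# pkg_tool is a hypothetical helper that prints the high-level package
# manager associated with a distribution family, following the pairings
# in the text above: DEB-based systems use apt, RPM-based systems use dnf.
pkg_tool() {
    case "$1" in
        debian|ubuntu|mint)  echo "apt" ;;      # DEB family
        rhel|fedora|centos)  echo "dnf" ;;      # RPM family
        *)                   echo "unknown" ;;  # anything else
    esac
}

pkg_tool ubuntu    # prints: apt
pkg_tool fedora    # prints: dnf
```

The case statement mirrors the rule of thumb from the text: first identify the distribution family, then use that family's catalogue-aware tool rather than the low-level dpkg or rpm commands.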

Package Installation

Suppose you have heard about a command called figlet, which prints enlarged text on the terminal, and you want to try it. However, you get the following message after executing the command:

$ figlet
-bash: figlet: command not found

That probably means the package is not installed on your system. If your distribution works with DEB packages, you can search its repositories using apt-cache search package_name or apt search package_name. The apt-cache command is used to search for packages and to list information about available packages. The following command looks for any occurrences of the term “figlet” in the packages’ names and descriptions:

$ apt-cache search figlet

figlet - Make large character ASCII banners out of ordinary text

The search identified a package called figlet that corresponds to the missing command. The installation and removal of a package require special permissions granted only to the system’s administrator: the user named root. On desktop systems, ordinary users can install or remove packages by prepending the command sudo to the installation/removal commands. That will require you to type your password to proceed. For DEB packages, the installation is performed with the command apt-get install package_name or apt install package_name:

$ sudo apt-get install figlet

In distributions based on RPM packages, searches are performed using yum search package_name or dnf search package_name. Let’s say you want to display some text in a more irreverent way, followed by a cartoonish cow, but you are not sure about the package that can perform that task. As with the DEB packages, the RPM search commands accept descriptive terms:

$ yum search speaking cow

After finding a suitable package in the repository, it can be installed with yum install package_name or dnf install package_name:

$ sudo yum install cowsay

Package Removal

The same commands used to install packages are used to remove them. All the commands accept the remove keyword to uninstall an installed package: apt-get remove package_name or apt remove package_name for DEB packages, and yum remove package_name or dnf remove package_name for RPM packages. The sudo command is also needed to perform the removal. For example, to remove the previously installed package figlet from a DEB-based distribution:

$ sudo apt-get remove figlet

A similar procedure is performed on an RPM-based system. For example, to remove the previously installed package cowsay from an RPM-based distribution:

$ sudo yum remove cowsay

The configuration files of the removed packages are kept on the system, so they can be used again if the package is reinstalled in the future.

Office Applications

Office applications are used for editing files such as texts, presentations, spreadsheets and other formats commonly used in an office environment. These applications are usually organised in collections called office suites.

For a long time, the most used office suite on Linux was the OpenOffice.org suite. OpenOffice.org was an open source version of the StarOffice suite released by Sun Microsystems. A few years later Sun was acquired by Oracle Corporation, which in turn transferred the project to the Apache Foundation, and OpenOffice.org was renamed Apache OpenOffice. Meanwhile, another office suite based on the same source code was released by the Document Foundation, which named it LibreOffice. The two projects have the same basic features and are compatible with the document formats of Microsoft Office. However, their preferred document format is the Open Document Format, a fully open and ISO standardized file format. The use of ODF files ensures that documents can be transferred between operating systems and applications from different vendors, such as Microsoft Office. The main applications offered by OpenOffice/LibreOffice are:

Writer: Text editor

Calc: Spreadsheets

Impress: Presentations

Draw: Vector drawing

Math: Math formulas

Base: Database

Both LibreOffice and Apache OpenOffice are open source software, but LibreOffice is licensed under the LGPLv3 and Apache OpenOffice under the Apache License 2.0. The licensing distinction implies that LibreOffice can incorporate improvements made by Apache OpenOffice, but Apache OpenOffice cannot incorporate improvements made by LibreOffice. That, together with a more active community of developers, is the reason most distributions adopt LibreOffice as their default office suite.

Web Browsers

For most users, the main purpose of a computer is to provide access to the Internet. Nowadays, web pages can act as a full featured app, with the advantage of being accessible from anywhere, without the need of installing extra software. That makes the web browser the most important application of the operating system, at least for the average user.

One of the best sources for learning about web development is the MDN Web Docs, available at https://developer.mozilla.org/. Maintained by Mozilla, the site is full of tutorials for beginners and reference materials on most modern web technologies.

The main web browsers in the Linux environment are Google Chrome and Mozilla Firefox. Chrome is a web browser maintained by Google but is based on the open source browser named Chromium, which can be installed using the distribution’s package manager and is fully compatible with Chrome. Maintained by Mozilla, a non-profit organization, Firefox is a browser whose origins are linked to Netscape, the first popular web browser to adopt the open source model. The Mozilla Foundation is deeply involved with the development of the open standards underlying the modern web.

Mozilla also develops other applications, like the e-mail client Thunderbird. Many users opt to use webmail instead of a dedicated email application, but a client like Thunderbird offers extra features and integrates better with other applications on the desktop.

Multimedia

Compared to the available web applications, desktop applications are still the best option for the creation of multimedia content. Multimedia related activities like video rendering often require high amounts of system resources, which are best managed by a local desktop application. Some of the most popular multimedia applications for the Linux environment and their uses are listed below.

Server Programs

When a web browser loads a page from a website, it actually connects to a remote computer and asks for a specific piece of information. In that scenario, the computer running the web browser is called the client and the remote computer is called the server. The server computer, which can be an ordinary desktop computer or specialized hardware, needs a specific program to manage each type of information it will provide. When it comes to serving web pages, most servers around the world deploy open source server programs. This particular server program is called an HTTP server (HTTP stands for Hypertext Transfer Protocol) and the most popular ones are Apache, Nginx and lighttpd. Even simple web pages may require many requests, which can be ordinary files — called static content — or dynamic content rendered from various sources. The role of an HTTP server is to collect and send all the requested data back to the browser, which then arranges the content as defined by the received HTML document (HTML stands for Hypertext Markup Language) and other supporting files. Therefore, the rendering of a web page involves operations performed on the server side and operations performed on the client side. Both sides can use custom scripts to accomplish specific tasks. On the HTTP server side, it is quite common to use the PHP scripting language. JavaScript is the scripting language used on the client side (the web browser).
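The exchange described above can be made concrete by building the text a browser actually sends. The sketch below only constructs and prints a minimal request; www.example.org is a placeholder host name and no network connection is made. (In a real request, lines are terminated with CRLF and the headers end with a blank line.)

```shell
#!/bin/sh
# Compose a minimal HTTP/1.1 GET request by hand. A browser sends text
# like this to port 80 of the server; the server answers with response
# headers followed by the HTML document.
host="www.example.org"
request="GET /index.html HTTP/1.1
Host: $host
Connection: close"
printf '%s\n\n' "$request"
```

If a tool like netcat is available on your system, piping such a request to `nc www.example.org 80` shows the server's raw HTTP response, exactly what the browser parses before rendering the page.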

Server programs can provide all kinds of information. It’s not uncommon to have a server program requesting information provided by other server programs. That is the case when an HTTP server requires information provided by a database server. For instance, when a dynamic page is requested, the HTTP server usually queries a database to collect all the required pieces of information and sends the dynamic content back to the client. In a similar way, when a user registers on a website, the HTTP server gathers the data sent by the client and stores it in a database. A database is an organized set of information. A database server stores contents in a formatted fashion, making it possible to read, write and link large amounts of data reliably and at great speed. Open source database servers are used in many applications, not only on the Internet. Even local applications can store data by connecting to a local database server. The most common type of database is the relational database, where the data is organized in predefined tables. The most popular open source relational databases are MariaDB (originated from MySQL) and PostgreSQL.

Data Sharing

In local networks, like the ones found in offices and homes, it is desirable that computers are not only able to access the Internet, but also able to communicate with each other. Sometimes a computer acts as a server, sometimes the same computer acts as a client. That is necessary when one wants to access files on another computer in the network — for instance, access a file stored on a desktop computer from a portable device — without the hassle of copying it to a USB drive or the like. Between Linux machines, NFS (Network File System) is often used. The NFS protocol is the standard way to share file systems in networks equipped only with Unix/Linux machines. With NFS, a computer can share one or more of its directories with specific computers on the network, so they can read and write files in these directories. NFS can even be used to share an entire operating system’s directory tree with clients that use it to boot from. These computers, called thin clients, are most often used in large networks to avoid the maintenance of each individual operating system on each machine. If there are other types of operating systems attached to the network, it is recommended to use a data sharing protocol that can be understood by all of them. This requirement is fulfilled by Samba. Samba implements a protocol for sharing files over the network that was originally made for the Windows operating system but is today compatible with all major operating systems. With Samba, computers in the local network can share not only files, but also printers.
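As an illustration, an NFS server shares directories through its /etc/exports file, one line per shared directory. This is only a sketch: the path and the client network below are made-up example values.

```
# /etc/exports -- each line shares one directory with a set of clients.
# /srv/share and 192.168.1.0/24 are example values:
# read-write access for the whole local subnet, with synchronous writes.
/srv/share  192.168.1.0/24(rw,sync,no_subtree_check)
```

After editing the file, the administrator typically re-reads the export table with `exportfs -r` (run as root), and clients can then mount the share over the network.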

On some local networks, the authorization given upon login on a workstation is granted by a central server, called the domain controller, which manages the access to various local and remote resources. The domain controller is a service provided by Microsoft’s Active Directory. Linux workstations can associate with a domain controller by using Samba or an authentication subsystem called SSSD. As of version 4, Samba can also work as a domain controller on heterogeneous networks.

If the goal is to implement a cloud computing solution able to provide various methods of web based data sharing, two alternatives should be considered: ownCloud and Nextcloud. The two projects are very similar because Nextcloud is a spin-off of ownCloud, which is not unusual among open source projects; such spin-offs are usually called forks. Both provide the same basic features: file sharing and sync, collaborative workspaces, calendar, contacts and mail, all through desktop, mobile and web interfaces. Nextcloud also provides private audio/video conferencing, whilst ownCloud is more focused on file sharing and integration with third-party software. Many more features are provided as plugins which can be activated later as needed. Both ownCloud and Nextcloud offer a paid version with extra features and extended support. What makes them different from other commercial solutions is the ability to install Nextcloud or ownCloud on a private server, free of charge, avoiding keeping sensitive data on an unknown server. As all the services depend on HTTP communication and are written in PHP, the installation must be performed on a previously configured web server, like Apache. If you consider installing ownCloud or Nextcloud on your own server, make sure to also enable HTTPS to encrypt all connections to your cloud.

Network Administration

Communication between computers is only possible if the network is working correctly. Normally, the network configuration is done by a set of programs running on the router, responsible for setting up and checking the network availability. In order to achieve this, two basic network services are used: DHCP (Dynamic Host Configuration Protocol) and DNS (Domain Name System). DHCP is responsible for assigning an IP address to the host when a network cable is connected or when the device enters a wireless network. When connecting to the Internet, the ISP’s DHCP server will provide an IP address to the requesting device. A DHCP server is very useful in local area networks as well, to automatically provide IP addresses to all connected devices. If DHCP is not configured or not working properly, it would be necessary to manually configure the IP address of each device connected to the network. It is not practical to manually set the IP addresses on large networks, or even in small ones, which is why most network routers come with a DHCP server pre-configured by default. An IP address is required to communicate with another device on an IP network, but domain names like www.lpi.org are much easier to remember than an IP number like 203.0.113.165. The domain name by itself, however, is not enough to establish the communication through the network. That is why the domain name needs to be translated to an IP address by a DNS server. The IP address of the DNS server is provided by the ISP’s DHCP server and is used by all connected systems to translate domain names to IP addresses. Both DHCP and DNS settings can be modified by entering the web interface provided by the router. For instance, it is possible to restrict IP assignment to known devices only, or to associate a fixed IP address with specific machines. It is also possible to change the default DNS server provided by the ISP. Some third-party DNS servers, like the ones provided by Google or OpenDNS, can sometimes provide faster responses and extra features.
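On small networks both services are often provided by a single program; dnsmasq is one common choice on home routers and embedded devices, since it combines a DHCP server with a DNS forwarder. The fragment below is an illustrative sketch of its configuration file, with made-up addresses:

```
# Example dnsmasq configuration (all addresses are placeholders).
# Hand out DHCP leases from a pool, with a 12 hour lease time:
dhcp-range=192.168.1.50,192.168.1.150,12h
# Always give a known device (identified by its MAC address) the same IP:
dhcp-host=aa:bb:cc:dd:ee:ff,192.168.1.10
# Forward DNS queries for unknown names to an upstream server:
server=9.9.9.9
```

This mirrors the features mentioned above: automatic address assignment, fixed addresses for specific machines, and a configurable upstream DNS server.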

Programming Languages

All computer programs (client and server programs, desktop applications and the operating system itself) are made using one or more programming languages. Programs can be a single file or a complex system of hundreds of files, which the operating system treats as a sequence of instructions to be interpreted and performed by the processor and other devices. There are numerous programming languages for very different purposes, and Linux systems provide a lot of them. Since open source software also includes the sources of the programs, Linux systems offer developers perfect conditions to understand, modify or create software according to their own needs. Every program begins as a text file, called source code. This source code is written in a more or less human-friendly language that describes what the program is doing. A computer processor cannot directly execute this code. In compiled languages, the source code is therefore converted to a binary file which can then be executed by the computer. A program called a compiler is responsible for the conversion from source code to executable form. Since the compiled binary is specific to one kind of processor, the program might have to be re-compiled to run on another type of computer.

In interpreted languages, the program does not need to be compiled beforehand. Instead, an interpreter reads the source code and executes its instructions every time the program is run. This makes development easier and faster, but at the same time interpreted programs tend to be slower than compiled programs. Here are some of the most popular programming languages:

JavaScript

JavaScript is a programming language mostly used in web pages. In its origins, JavaScript applications were very simple, like form validation routines. Today, JavaScript is considered a first class language and it is used to create very complex applications, not only on the web.

C

The C programming language is closely related to operating systems, particularly Unix, but it is used to write all kinds of programs for almost any kind of device. The great advantages of C are flexibility and speed. The same source code written in C can be compiled to run on different platforms and operating systems, with little or no modification. After being compiled, however, the program will run only on the targeted system.

Java

The main aspect of Java is that programs written in this language are portable, which means that the same program can be executed in different operating systems. Despite the name, Java is not related to JavaScript.

Perl

Perl is a programming language mostly used to process text content. It has a strong emphasis on regular expressions, which makes Perl a language suited for text filtering and parsing.

Shell

The shell, particularly the Bash shell, is not just a programming language but also an interactive interface to run other programs. Shell programs, known as shell scripts, can automate complex or repetitive tasks in the command line environment.
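As a small illustration of such automation, the sketch below renames every .txt file in a directory to .bak in a single loop. The directory name and the suffixes are arbitrary examples chosen for the demonstration:

```shell
#!/bin/sh
# Create a throwaway directory with two sample files, then rename all
# .txt files in it to .bak. ${f%.txt} is shell parameter expansion: it
# strips the .txt suffix from the file name.
mkdir -p demo
touch demo/a.txt demo/b.txt
for f in demo/*.txt; do
    mv "$f" "${f%.txt}.bak"
done
ls demo    # now contains a.bak and b.bak
```

Done by hand, each rename is one mv command; done in a loop, the same script handles two files or two thousand, which is exactly the kind of repetitive task shell scripts are written for.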

Python

Python is a very popular programming language among students and professionals not directly involved with computer science. While it has advanced features, Python is a good way to start learning programming because of its easy-to-use approach.

PHP

PHP is most used as a server side scripting language for generating content for the web. Most online HTML pages are not static files, but dynamic content generated by the server from various sources, like databases. PHP programs — sometimes just called PHP pages or PHP scripts — are often used to generate this kind of content. The term LAMP comes from the combination of a Linux operating system, an Apache HTTP server, a MySQL (or MariaDB) database and PHP programming. LAMP servers are a very popular solution for running web servers. Besides PHP, all of the programming languages described before can be used to implement such applications too.

C and Java are compiled languages. In order to be executed by the system, source code written in C is converted to binary machine code, whereas Java source code is converted to bytecode that is executed in a special software environment called the Java Virtual Machine. JavaScript, Perl, shell script, Python and PHP are all interpreted languages, which are also called scripting languages.
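The difference can be seen from the shell: an interpreted script runs straight from its source code, with no separate compilation step (the file name below is arbitrary):

```shell
# Write a small script and mark it executable
cat > hello.sh <<'EOF'
#!/bin/sh
echo "Hello from an interpreted script"
EOF
chmod +x hello.sh

# The interpreter named on the first line executes the source directly
./hello.sh   # prints: Hello from an interpreted script
```

A C program, by contrast, would first have to be compiled (for example with `gcc`) before the resulting binary could be run.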

1.4 Lesson 1

The Linux Community and a Career in Open Source

Introduction

There was a time when working with Linux on the desktop was considered hard, since the system lacked many of the more polished desktop applications and configuration tools that other operating systems had. One reason for that was that Linux was a lot younger than many other operating systems, so it was easier to start by developing the more essential command line applications and leave the more complex graphical tools for later. Since Linux was initially targeted at more advanced users, that was not much of a problem. But those days are long gone. Today, Linux desktop environments are very mature, leaving nothing to be desired with regard to features and ease of use. Nevertheless, the command line is still considered a powerful tool used each and every day by advanced users. In this lesson we’ll take a look at some of the basic desktop skills you will need in order to choose the best tool for the right job, including getting to the command line.

Linux User Interfaces

When using a Linux system, you either interact with a command line or with a graphical user interface. Both ways grant you access to numerous applications that support performing almost any task with the computer. While Objective 1.2 already introduced you to a series of commonly used applications, we will start this lesson with a closer look at desktop environments, ways to access the terminal and tools used for presentations and project management.

Desktop Environments

Linux has a modular approach where different parts of the system are developed by different projects and developers, each one filling a specific need or objective. Because of that, there are several desktop environments to choose from and, together with package managers, the default desktop environment is one of the main differences among the many distributions out there. Unlike proprietary operating systems like Windows and macOS, where users are restricted to the desktop environment that comes with their OS, on Linux it is possible to install multiple environments and pick the one that best suits you and your needs. Basically, there are two major desktop environments in the Linux world: Gnome and KDE. They are both very complete, with a large community behind them, and aim for the same purpose but with slightly divergent approaches. In a nutshell, Gnome tries to follow the KISS (“keep it simple stupid”) principle, with very streamlined and clean applications. KDE, on the other hand, offers a larger selection of applications and gives the user the opportunity to change every configuration setting in the environment.

While Gnome applications are based on the GTK toolkit (written in the C language), KDE applications make use of the Qt library (written in C++). One of the most practical aspects of writing applications with the same graphical toolkit is that applications will tend to share a similar look and feel, which gives the user a sense of unity during their experience. Another important characteristic is that sharing the same graphical library among many frequently used applications may save some memory and speed up loading once the library has been loaded for the first time.

Getting to the Command Line

For us, one of the most important applications is the graphical terminal emulator. Those are called terminal emulators because they really emulate, in a graphical environment, the old style serial terminals (often Teletype machines) that were in fact clients that used to be connected to a remote machine where the computing actually happened. Those machines were really simple computers with no graphics at all that were used on the first versions of Unix. In Gnome, such an application is called Gnome Terminal, while in KDE it can be found as Konsole. But there are many other choices available, such as Xterm. Those applications are a way for us to have access to a command line environment in order to be able to interact with a shell.

So, you should look in the application menu of your distribution of choice for a terminal application. Besides any differences between them, every application will offer you what is necessary to gain confidence in using the command line. Another way to get to a terminal is to use the virtual TTYs. You can get into them by pressing Ctrl + Alt + F#. Read F# as one of the function keys from 1 to 7. Some of the initial combinations might be running your session manager or your graphical environment; the others will show a prompt asking for your login name, like the one below:
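A virtual console login prompt typically looks similar to the following (distribution name, version, host name and console number will of course vary):

```
Debian GNU/Linux 10 mymachine tty3

mymachine login:
```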

Presentations and Projects

The most important tool for presentations on Linux is LibreOffice Impress. It’s part of the open source office suite called LibreOffice. Think of LibreOffice as an open source replacement for Microsoft Office. It can even open and save the PPT and PPTX files that are native to PowerPoint. In spite of that, we really recommend using the native ODP Impress format. ODP is part of the larger Open Document Format, which is an international standard for this kind of file. This is especially important if you want to keep your documents accessible for many years and worry less about compatibility problems. Because it is an open standard, anyone can implement the format without paying any royalties or licenses. This also makes you free to try other presentation software that you may like better and take your files with you, as it is very likely they will be compatible with that software.

But if you prefer code over graphical interfaces, there are a few tools for you to choose from. Beamer is a LaTeX class that can create slide presentations from LaTeX code. LaTeX itself is a typesetting system largely used for writing scientific documents in academia, especially for its capacity to handle complex math symbols, which other software has difficulty dealing with. If you are at university and need to deal with equations and other math-related problems, Beamer can save you a lot of time.

The other option is Reveal.js, an awesome NPM package (NPM is the default NodeJS package manager) which allows you to create beautiful presentations using the web. So, if you can write HTML and CSS, Reveal.js will bring most of the JavaScript necessary to create pretty and interactive presentations that adapt well to any resolution and screen size. Lastly, if you want a replacement for Microsoft Project, you can try GanttProject or ProjectLibre. Both are very similar to their proprietary counterpart and compatible with Project files.

Industry Uses of Linux

Linux is heavily used in the software and Internet industries. Sites like W3Techs report that about 68% of the website servers on the Internet are powered by Unix, and the biggest portion of those are known to be Linux. This large adoption is due not only to the free nature of Linux (both as in “free beer” and as in “freedom of speech”) but also to its stability, flexibility and performance. These characteristics allow vendors to offer their services at a lower cost and with greater scalability. A significant portion of Linux systems nowadays run in the cloud, either in an IaaS (Infrastructure as a Service), PaaS (Platform as a Service) or SaaS (Software as a Service) model. IaaS is a way to share the resources of a large server by offering customers access to virtual machines, which are, in fact, multiple operating systems running as guests on a host machine on top of an important piece of software called a hypervisor. The hypervisor is responsible for making it possible for these guest OSs to run by segregating and managing the resources available on the host machine. That’s what we call virtualization. In the IaaS model, you pay only for the fraction of resources your infrastructure uses. Linux has three well-known open source hypervisors: Xen, KVM and VirtualBox. Xen is probably the oldest of them. KVM overtook Xen as the most prominent Linux hypervisor. Its development is sponsored by Red Hat, and it is used by them and other players, both in public cloud services and in private cloud setups. VirtualBox has belonged to Oracle since its acquisition of Sun Microsystems and is usually used by end users because of its ease of use and administration. PaaS and SaaS, on the other hand, build on the IaaS model, both technically and conceptually. In PaaS, instead of a virtual machine, users have access to a platform where they can deploy and run their applications.
The goal here is to ease the burden of dealing with system administration tasks and operating system updates. Heroku is a common PaaS example where program code can just be run without taking care of the underlying containers and virtual machines.

Lastly, SaaS is the model where you usually pay for a subscription in order to just use a piece of software without worrying about anything else. Dropbox and Salesforce are two good examples of SaaS.

Most of these services are accessed through a web browser. A project like OpenStack is a collection of open source software that can make use of different hypervisors and other tools in order to offer a complete IaaS cloud environment on premises, by leveraging the power of a computer cluster in your own datacenter. However, the setup of such an infrastructure is not trivial.

Privacy Issues when using the Internet

The web browser is a fundamental piece of software on any desktop these days, but some people still lack the knowledge to use it securely. While more and more services are accessed through a web browser, almost all actions done through a browser are tracked and analyzed by various parties. Securing access to internet services and preventing tracking is an important aspect of using the internet in a safe manner.

Cookie Tracking

Let’s assume you have browsed an e-commerce website, selected a product you wanted and placed that in the shopping cart. But at the last second, you have decided to give it a second thought and think a little longer if you really needed that. After a while, you start seeing ads of that same product following you around the web. When clicking on the ads, you are immediately sent to the product page of that store again. It’s not uncommon that the products you placed in the shopping cart are still there, just waiting for you to decide to check them out. Have you ever wondered how they do that? How they show you the right ad at another web page? The answer for these questions is called cookie tracking.

Cookies are small files a website can save on your computer in order to store and retrieve some kind of information that can be useful for your navigation. They have been in use for many years and are one of the oldest ways to store data on the client side. One good example of their use are unique shopping cart IDs. That way, if you ever come back to the same website a few days later, the store can remind you of the products you placed in your cart during your last visit and save you the time of finding them again. That’s usually okay, since the website is offering you a useful feature and not sharing any data with third parties. But what about the ads that are shown to you while you surf other web pages? That’s where the ad networks come in. Ad networks are companies that offer ads for e-commerce sites like the one in our example, on one side, and monetization for websites, on the other side. Content creators like bloggers, for example, can make some space available for those ad networks on their blogs, in exchange for a commission related to the sales generated by those ads.

But how do they know what product to show you? They usually do that by also saving a cookie from the ad network at the moment you visited or searched for a certain product on the e-commerce website. By doing that, the network is able to retrieve the information in that cookie wherever the network has ads, correlating it with the products you were interested in. This is one of the most common ways to track someone over the Internet. The example above uses an e-commerce site to make things more tangible, but social media platforms do the same with their “Like” or “Share” buttons and their social login. One way you can get rid of that is by not allowing third-party websites to store cookies in your browser. This way, only the websites you visit can store their cookies. But be aware that some “legitimate” features may not work well if you do that, because many sites today rely on third-party services to work. So, you can look for a cookie manager in your browser’s add-on repository in order to have fine-grained control over which cookies are being stored on your machine.
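On the wire, a cookie is nothing more than an extra HTTP header. The sketch below prints a made-up Set-Cookie response header such as a shop might send; the cookie name and value are invented for the example:

```shell
# A server assigns a cart ID via a response header; the browser stores it
# and sends it back on later requests, which is what makes tracking possible.
printf 'Set-Cookie: cart_id=8f3a2c1d; Max-Age=2592000; Path=/\r\n'
```

When the cookie is set by an ad network’s domain rather than the shop itself, the same mechanism allows the network to recognize you across every site that embeds its ads.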

Do Not Track (DNT)

Another common misconception is related to a browser configuration better known as DNT. That’s an acronym for “Do Not Track”, and it can be turned on in basically any current browser. Similarly to private mode, it’s not hard to find people who believe they will not be tracked if they have this configuration on. Unfortunately, that’s not always true. Currently, DNT is just a way for you to tell the websites you visit that you do not want them to track you. But, in fact, they are the ones who decide whether they will respect your choice or not. In other words, DNT is a way to opt out of website tracking, but there is no guarantee that the choice is honored. Technically, this is done by simply sending an extra flag in the header of the HTTP request (DNT: 1) upon requesting data from a web server. If you want to know more about this topic, the website https://allaboutdnt.com is a good starting point.
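The flag itself is just one extra line in the request. The sketch below prints a minimal HTTP request carrying it (example.com stands in for any site):

```shell
# A minimal HTTP request with the Do Not Track flag enabled
printf 'GET / HTTP/1.1\r\nHost: example.com\r\nDNT: 1\r\n\r\n'
```

Whether the server on the other end honors that single header is entirely up to its operator.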

“Private” Windows

You might have noticed the quotes in the heading above. This is because those windows are not as private as most people think. The names may vary, but they can be called “private mode”, “incognito” or “anonymous” tab, depending on which browser you are using. In Firefox, you can easily use it by pressing the Ctrl + Shift + P keys. In Chrome, just press Ctrl + Shift + N. What it actually does is open a brand new session, which usually doesn’t share any configuration or data with your standard profile. When you close the private window, the browser will automatically delete all the data generated by that session, leaving no trace on the computer used. This means that no personal data, like history, passwords or cookies, is stored on that computer. Thus, many people misunderstand this concept by believing that they can browse anonymously on the Internet, which is not completely true.

One thing that private or incognito mode does do is avoid what we call cookie tracking. When you visit a website, it can store a small file on your computer which may contain an ID that can be used to track you. Unless you configure your browser not to accept third-party cookies, ad networks or other companies can store and retrieve that ID and actually track your browsing across websites. But since the cookies stored in a private mode session are deleted right after you close that session, that information is forever lost. Besides that, websites and other peers on the Internet can still use plenty of other techniques in order to track you. So, private mode brings you some level of anonymity, but it’s completely private only on the computer you are using. If you are accessing your email account or banking website from a public computer, like in an airport or a hotel, you should definitely access those using your browser’s private mode. In other situations there can be benefits, but you should know exactly which risks you are avoiding and which ones are unaffected. Whenever you use a publicly accessible computer, be aware that other security threats such as malware or key loggers might exist. Be careful whenever you enter personal information, including usernames and passwords, on such computers, or when you download or copy confidential data.

Choosing the Right Password

One of the most difficult situations any user faces is choosing a secure password for the services they use. You have certainly heard before that you should not use common combinations like 123456 or 654321, nor easily guessable numbers like your (or a relative’s) birthday or zip code. The reason is that those are all very obvious combinations and the first attempts an invader will try in order to gain access to your account. There are known techniques for creating a safe password. One of the most famous is making up a sentence which reminds you of that service and picking the first letter of each word. Let’s assume I want to create a good password for Facebook, for example. In this case, I could come up with a sentence like “I would be happy if I had a 1000 friends like Mike”. Picking the first letter of each word gives the password IwbhiIha1000flM. This results in a 15-character password which is long enough to be hard to guess and easy to remember at the same time (as long as I can remember the sentence and the “algorithm” for retrieving the password). Sentences are usually easier to remember than passwords, but even this method has its limitations. We have to create passwords for so many services nowadays, and as we use them with different frequencies, it will eventually be very difficult to remember all the sentences at the time we need them. So what can we do? You may answer that the wisest thing to do in this case is to create a couple of good passwords and reuse them on similar services, right? Unfortunately, that’s also not a good idea. You have probably also heard that you should not reuse the same password among different services. The problem with doing such a thing is that a specific service may leak your password (yes, it happens all the time), and any person who has access to it will try to use the same email and password combination on other popular services on the Internet, in the hope that you have done exactly that: recycled passwords. And guess what? In case they are right, you will end up having a problem not only on one service but on several of them. And believe me, we tend to think it’s not going to happen to us until it’s too late. So, what can we do in order to protect ourselves? One of the most secure approaches available today is using what is called a password manager. A password manager is a piece of software that essentially stores all your passwords and usernames in an encrypted format which can be decrypted by a master password. This way you only need to remember one good password, since the manager will keep all the others safe for you. KeePass is one of the most famous and feature-rich open source password managers available. It stores your passwords in an encrypted file within your filesystem. The fact that it’s open source is an important point for this kind of software, since it guarantees the software will not make any undisclosed use of your data, because any developer can audit the code and know exactly how it works. This brings a level of transparency that’s impossible to reach with proprietary code. KeePass has ports for most operating systems, including Windows, Linux and macOS, as well as mobile ones like iOS and Android. It also includes a plugin system that is able to extend its functionality far beyond the defaults. Bitwarden is another open source solution that has a similar approach, but instead of storing your data in a file, it makes use of a cloud server. This way, it’s easier to keep all your devices synchronized and your passwords easily accessible through the web. Bitwarden is one of the few projects that make not only the clients but also the cloud server available as open source software.
This means you can host your own instance of Bitwarden and make it available to anyone, like your family or your company’s employees. This gives you flexibility but also total control over how their passwords are stored and used. One of the most important things to keep in mind when using a password manager is to create a random password for each different service, since you will not need to remember them anyway. It would be pointless to use a password manager to store recycled or easily guessable passwords. Thus, most managers offer a random password generator you can use to create those for you.
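Since the manager remembers the passwords for you, each one can be long and fully random. One common way to generate such a password on the command line is to read random bytes from the kernel’s /dev/urandom device; the length and character set below are arbitrary choices for the example:

```shell
# Generate a 16-character random password from the kernel's random source
tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 16
echo   # just to terminate the line
```

Most password managers include an equivalent generator built in, often with options for symbols and configurable length.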

Encryption

Whenever data is transferred or stored, precautions need to be taken to ensure that third parties may not access the data. Data transferred over the internet passes by a series of routers and networks where third parties might be able to access the network traffic. Likewise, data stored on physical media might be read by anyone who comes into possession of that media. To avoid this kind of access, confidential information should be encrypted before it leaves a computing device.

TLS

Transport Layer Security (TLS) is a protocol that offers security over network connections by making use of cryptography. TLS is the successor of the Secure Sockets Layer (SSL), which has been deprecated because of serious flaws. TLS has also evolved a couple of times in order to adapt and become more secure; its current version is 1.3. It can provide both privacy and authenticity by making use of what are called symmetric and public-key cryptography. By this we mean that, once in use, you can be sure that nobody will be able to eavesdrop on or alter your communication with that server during that session. TLS is what is used in the HTTPS protocol (HTTP over TLS) in order to make it possible to send sensitive data (like your credit card number) through the web. The most important lesson here is recognizing whether a website is trustworthy: look for the “lock” symbol in the browser’s address bar. If you desire, you can click on it to inspect the certificate, which plays an important role in the HTTPS protocol. Explaining how TLS works goes way beyond the purpose of this article, but you can find more information on Wikipedia and in the Mozilla wiki.

File and E-mail Encryption With GnuPG

There are plenty of tools for securing e-mails, but one of the most important of them is certainly GnuPG. GnuPG stands for GNU Privacy Guard, and it is an open source implementation of OpenPGP, an international standard codified in RFC 4880. GnuPG can be used to sign, encrypt, and decrypt texts, e-mails, files, directories, and even whole disk partitions. It works with public-key cryptography and is widely available. In a nutshell, GnuPG creates a pair of keys for you: a public key and a private key. As the names imply, the public key can be made available to anyone, while the private key needs to be kept secret. People will use your public key to encrypt data which only your private key will be able to decrypt. You can also use your private key to sign any file or e-mail, which can then be validated against the corresponding public key. This digital signature works analogously to a real-world signature: as long as you are the only one who possesses your private key, the receiver can be sure that it was you who authored the content. By making use of cryptographic hash functionality, GnuPG will also guarantee that no changes were made after signing, because any change to the content would invalidate the signature. GnuPG is a very powerful tool and, to a certain extent, also a complex one. You can find more information on its website and in the Arch Linux wiki (the Arch Linux wiki is a very good source of information, even if you don’t use Arch Linux).
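A typical GnuPG workflow on the command line looks like the following sketch; the e-mail addresses and file names are placeholders:

```
$ gpg --gen-key                                          # create your key pair interactively
$ gpg --export --armor you@example.com > public.key      # export your public key to share
$ gpg --encrypt --recipient friend@example.com file.txt  # encrypt a file for a recipient
$ gpg --decrypt file.txt.gpg                             # decrypt a file sent to you
$ gpg --sign file.txt                                    # sign a file with your private key
```

Encryption uses the recipient’s public key, so only the recipient’s private key can decrypt the result; signing works the other way around.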

Disk Encryption

A good way to secure your data is to encrypt your whole disk or partition. There are many open source tools you can use to achieve this, and how they work and what level of encryption they offer vary significantly. There are basically two methods available: stacked filesystem encryption and block device encryption. Stacked filesystem solutions are implemented on top of an existing filesystem. When using this method, files and directories are encrypted before being stored on the filesystem and decrypted after being read. This means the files are stored on the host filesystem in an encrypted form (meaning that their contents, and usually also their file and folder names, are replaced by random-looking data), but other than that they still exist in that filesystem as they would without encryption: as normal files, symlinks, hardlinks and so on. Block device encryption, on the other hand, happens below the filesystem layer, making sure everything written to the block device is encrypted. If you look at the block device while it’s offline, it will look like a large section of random data, and you won’t even be able to tell what type of filesystem is there without decrypting it first. This means you can’t tell which is a file or a directory, how big they are or what kind of data they hold, because metadata, directory structure and permissions are also encrypted. Both methods have their own pros and cons. Among all the options available, you should take a look at dm-crypt, which is the de facto standard for block encryption on Linux systems, since it’s native in the kernel. It can be used with the LUKS (Linux Unified Key Setup) extension, which is a specification that implements a platform-independent standard for use with various tools. If you want to try a stacked method, you should take a look at EncFS, which is probably the easiest way to secure data on Linux, because it does not require root privileges and can work on an existing filesystem without modifications.
Finally, if you need to access data on various platforms, check out VeraCrypt. It is the successor of TrueCrypt and allows the creation of encrypted media and files which can be used on Linux as well as on macOS and Windows.
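As a sketch of the block device method, setting up a LUKS-encrypted partition with dm-crypt’s cryptsetup tool looks like this. The device name /dev/sdb1 is only an example, the commands must be run as root, and luksFormat destroys the partition’s current contents:

```
# cryptsetup luksFormat /dev/sdb1          # initialize the partition for LUKS
# cryptsetup open /dev/sdb1 secretdata     # unlock it as /dev/mapper/secretdata
# mkfs.ext4 /dev/mapper/secretdata         # create a filesystem on the mapped device
# mount /dev/mapper/secretdata /mnt        # use it like any other filesystem
```

After `cryptsetup close`, the partition again looks like random data to anyone without the passphrase.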

2.1 Lesson 1

Finding Your Way on a Linux System - Command Line Basics

Modern Linux distributions have a wide range of graphical user interfaces, but an administrator will always need to know how to work with the command line, or shell as it is called. The shell is a program that enables text-based communication between the operating system and the user. It is usually a text mode program that reads the user’s input and interprets it as commands to the system.

There are several different shells on Linux, these are just a few:

Bourne-again shell (Bash)

C shell (csh or tcsh, the enhanced csh)

Korn shell (ksh)

Z shell (zsh)

On Linux the most common one is the Bash shell. This is also the one that will be used in examples or exercises here.

When using an interactive shell, the user inputs commands at a so-called prompt. For each Linux distribution, the default prompt may look a little different, but it usually follows this structure:

username@hostname current_directory shell_type

username

Name of the user that runs the shell

hostname

Name of the host on which the shell runs. There is also a command, hostname, which can show or set the system’s host name.

current_directory

The directory that the shell is currently in. A ~ means that the shell is in the current user’s home directory.

shell_type

$ indicates the shell is run by a regular user.

# indicates the shell is run by the superuser root.

As we do not need any special privileges, we will use an unprivileged prompt in the following examples. For brevity, we will just use the $ as prompt.
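The individual pieces of the prompt can be reproduced with standard commands, which is a quick way to check who you are and where you are:

```shell
whoami   # prints the user name part of the prompt
pwd      # prints the current directory part of the prompt
```

On most systems the hostname command mentioned above fills in the remaining part.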

Command Line Structure

Most commands at the command line follow the same basic structure:

command [option(s)/parameter(s)...] [argument(s)...]

Take the following command as an example:

$ ls -l /home

Let’s explain the purpose of each component:

Command

The program that the user will run, ls in the above example.

Option(s)/Parameter(s)

A “switch” that modifies the behavior of the command in some way, such as -l in the above example. Options have a short form and often also a long form: for example, -l and --format=long have the same meaning.

Multiple options can be combined as well and for the short form, the letters can usually be typed together. For example, the following commands all do the same:

$ ls -al

$ ls -a -l

$ ls --all --format=long

Argument(s)

Additional data that is required by the program, like a filename or path, such as /home in the above example.

The only mandatory part of this structure is the command itself. In general, all other elements are optional, but a program may require certain options, parameters or arguments to be specified.

Most commands display a short overview of available options when they are run with the --help parameter. We will learn additional ways to learn more about Linux commands soon.
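For example, GNU ls prints a usage summary when asked for help (the output below is trimmed to the first lines, and the exact wording may differ between versions):

```shell
ls --help | head -n 3
```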

Command Behavior Types

The shell supports two types of commands:

Internal

These commands are part of the shell itself and are not separate programs. There are around 30 such commands. Their main purpose is executing tasks inside the shell (e.g. export).

External

These commands reside in individual files. These files are usually binary programs or scripts. When a command which is not a shell builtin is run, the shell uses the PATH variable to search for an executable file with the same name as the command. In addition to programs installed with the distribution’s package manager, users can create their own external commands as well.
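The list of directories the shell searches is stored in the PATH variable and can be displayed directly:

```shell
# Directories are separated by colons and searched from left to right
echo "$PATH"
```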

The command type shows what type a specific command is:

$ type echo

echo is a shell builtin

$ type man

man is /usr/bin/man

Quoting

As a Linux user, you will have to create or manipulate files or variables in various ways. This is easy when working with short filenames and single values, but it becomes more complicated when, for example, spaces, special characters and variables are involved. Shells provide a feature called quoting which encapsulates such data using various kinds of quotes (" ", ' '). In Bash, there are three types of quotes:

Double quotes

Single quotes

Escape characters

For example, the following commands do not act in the same way due to quoting:
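As a small sketch of the difference (the variable name is made up for the example): double quotes allow variable expansion, while single quotes and the escape character suppress it.

```shell
greeting="World"
echo "Hello, $greeting"    # double quotes: prints Hello, World
echo 'Hello, $greeting'    # single quotes: prints Hello, $greeting
echo Hello, \$greeting     # escape character: also prints Hello, $greeting
```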